Skip to content

Handle win32 separator for cygwin paths #141864

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Berrysoft
Copy link
Contributor

@Berrysoft Berrysoft commented Jun 1, 2025

This PR handles a issue that cygwin actually supports Win32 path, so we need to handle the Win32 prefix and separaters.

r? @mati865

cc @jeremyd2019

Not sure if I should handle the prefix like the windows target... Cygwin does support win32 paths directly going through the APIs, but I think it's not the recommended way.

Here I just use cygwin_conv_path because it handles both cygwin and win32 paths correctly and convert them into absolute POSIX paths.

UPDATE: Windows path prefix is handled.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 1, 2025
@jieyouxu jieyouxu added the O-cygwin Target: *-pc-cygwin label Jun 1, 2025
@rust-log-analyzer

This comment has been minimized.

Copy link
Contributor

@jeremyd2019 jeremyd2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I just use cygwin_conv_path because it handles both cygwin and win32 paths correctly and convert them into absolute POSIX paths.

I was going to raise the issue of encoding, but looking at the code it doesn't seem to deal with WIN_A as a different encoding than POSIX paths, so I guess that's not an issue. I guess I need to have a conversation on the Cygwin list to better understand this.

@Berrysoft
Copy link
Contributor Author

Finally I reuse the prefix parser from Windows. Now the cygwin Path can handle Win32 path correctly. It should behave a little different as it allows / as verbatim separater. Here is a simple test code:

use std::path::{absolute, Path};

fn main() {
    test_path("C:\\msys64");
    test_path("/dev/sda1");
    test_path("./sda1");
    test_path("/c/Users");
    test_path("\\\\?\\UNC/localhost/share");
}

fn test_path(p: impl AsRef<Path>) {
    let p = p.as_ref();
    let abs = absolute(p).unwrap();
    if p.is_absolute() {
        println!("{} == {}", p.display(), abs.display());
    } else {
        println!("{} -> {}", p.display(), abs.display());
    }
    println!("{:?}", p.components().collect::<Vec<_>>());
}

On Windows, it should output (the current drive is D:)

C:\msys64 == C:\msys64
[Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) }), RootDir, Normal("msys64")]
/dev/sda1 -> D:\dev\sda1
[RootDir, Normal("dev"), Normal("sda1")]
./sda1 -> <<<current dir>>>\sda1
[CurDir, Normal("sda1")]
/c/Users -> D:\c\Users
[RootDir, Normal("c"), Normal("Users")]
\\?\UNC/localhost/share == \\?\UNC/localhost/share
[Prefix(PrefixComponent { raw: "\\\\?\\UNC/localhost/share", parsed: VerbatimUNC("localhost/share", "") })]

While on Cygwin, it should output

C:\msys64 == /c/msys64
[Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) }), RootDir, Normal("msys64")]
/dev/sda1 == /dev/sda1
[RootDir, Normal("dev"), Normal("sda1")]
./sda1 -> <<<current dir>>>/sda1
[CurDir, Normal("sda1")]
/c/Users == /c/Users
[RootDir, Normal("c"), Normal("Users")]
\\?\UNC/localhost/share == //?/UNC/localhost/share
[Prefix(PrefixComponent { raw: "\\\\?\\UNC/localhost/share", parsed: VerbatimUNC("localhost", "share") })]

@Berrysoft
Copy link
Contributor Author

@jeremyd2019

I was going to raise the issue of encoding, but looking at the code it doesn't seem to deal with WIN_A as a different encoding than POSIX paths, so I guess that's not an issue. I guess I need to have a conversation on the Cygwin list to better understand this.

Actually rust doesn't handle very well for non-UTF-8 encoding on Unix. OsStr and Path are all UTF-8 (because they could be constructed from a UTF-8 str without copy) and they are passed to OS APIs directly without encoding conversion. The only concerns, I think, should be command line input and console input. Do you know that whether Cygwin handles the encoding conversion of argv and stdin? I hope it does:)

@Berrysoft
Copy link
Contributor Author

OK, seems that the command line is converted from UTF-16: https://github.com/cygwin/cygwin/blob/972872c0d396dbf0ce296b8280cee08ce7727b51/winsup/cygwin/dcrt0.cc#L902

And stdin handles the encoding conversion: https://github.com/cygwin/cygwin/blob/972872c0d396dbf0ce296b8280cee08ce7727b51/winsup/cygwin/fhandler/pty.cc#L3935

So I think all inputs are UTF-8 for a Cygwin program, and the developers could not construct a non-UTF-8 Win32 path if they aren't aware of encodings. If they are, they should know that Rust codes are UTF-8:)

@Berrysoft
Copy link
Contributor Author

Now I'm a little worried about cygwin_conv_path then. If the maintainers someday find that it should handle code page encodings, the code here will be messed up:(

@pietroalbini
Copy link
Member

This was originally reported by @Ry0taK through Rust's security disclosure process, and since this affects a still-in-development tier 3 target (with very few users right now) we agreed to let the development of the fix happen in the open.

@mati865
Copy link
Member

mati865 commented Jun 2, 2025

This change is T-libs area, so it will require their review.

r? rust-lang/libs

I'm not familiar with Cygwin API at all, but I see what you want to achieve here, and I think it's sensible.

Now I'm a little worried about cygwin_conv_path then. If the maintainers someday find that it should handle code page encodings, the code here will be messed up:(

They wouldn't want to break that API as a normal release, right? I'd expected a documented API to remain backwards compatible.

@pietroalbini I'm out of the loop, could you shed some light on your comment?

@rustbot rustbot assigned tgross35 and unassigned mati865 Jun 2, 2025
@mati865
Copy link
Member

mati865 commented Jun 2, 2025

Also, there is a typo in both the title and the first commit 😉

@Berrysoft Berrysoft changed the title Handle win32 separater for cygwin paths Handle win32 separator for cygwin paths Jun 2, 2025
@jeremyd2019
Copy link
Contributor

So I think all inputs are UTF-8 for a Cygwin program, and the developers could not construct a non-UTF-8 Win32 path if they aren't aware of encodings. If they are, they should know that Rust codes are UTF-8:)

Cygwin's encoding may not always be UTF-8 - one can set the locale environment variables like they can on Unix to set the encoding, and then the conversions of paths and terminal input will use that encoding instead of UTF-8.

@Berrysoft
Copy link
Contributor Author

In such locale, Rust doesn't even work on Linux. That's fair:)

@pietroalbini
Copy link
Member

pietroalbini commented Jun 3, 2025

@pietroalbini I'm out of the loop, could you shed some light on your comment?

@mati865 this was originally reported to security@rust-lang.org as it could defeat path traversal mitigations like this:

if p.components().into_iter().any(|x| x == Component::ParentDir) {
    panic!("path traversal");
}

By only parsing unix-like paths, it would be possible to sneak in a path traversal using win32 paths. Given the current status of the target we agreed it would be ok to develop the fix in the open.

@Berrysoft
Copy link
Contributor Author

@tgross35 ping?

Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty reasonable to me, a few small comments/questions

@tgross35
Copy link
Contributor

@pietroalbini I'm out of the loop, could you shed some light on your comment?

@mati865 this was originally reported to security@rust-lang.org as it could defeat path traversal mitigations like this:

if p.components().into_iter().any(|x| x == Component::ParentDir) {
    panic!("path traversal");
}

By only parsing unix-like paths, it would be possible to sneak in a path traversal using win32 paths. Given the current status of the target we agreed it would be ok to develop the fix in the open.

@Berrysoft could you add a regression test to this effect that fails without this change? Probably in library/std/tests/path.rs.

@Berrysoft Berrysoft requested a review from tgross35 June 10, 2025 07:59
Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One test request then lgtm with a squash

@Berrysoft Berrysoft force-pushed the cygwin-path branch 2 times, most recently from a15c379 to b6ad7ff Compare June 10, 2025 17:12
@jeremyd2019
Copy link
Contributor

Should there be some testing of // prefixes, like //?/C:/foo/bar and //server/share/foo.txt?

@Berrysoft Berrysoft force-pushed the cygwin-path branch 2 times, most recently from dfea730 to db6d165 Compare June 12, 2025 02:49
@Berrysoft
Copy link
Contributor Author

Tested on Cygwin and I found that Cygwin even treats //./?/C:/msys64/../../../?/D:/msys64 equal to D:\msys64. So Cygwin really thinks that's a POSIX path. I think I should follow it. The path without driver letter or //, and not containing \, is a POSIX path even it contains /?/ and could be recognized finally.

@jeremyd2019
Copy link
Contributor

OK, I think you're into stuff that matters more about what the rust apis are supposed to do rather than what Cygwin does... If the prefix parser is meant for determining "roots you can't .. your way out of" vs "an alternate namespace starts here". I bet it's the former, because otherwise it would want to deal with mount points on unix. Basically a native rust implementation of PathCchStripToRoot

@ChrisDenton
Copy link
Member

I'm a bit confused about the current conversation, can anyone clarify?

It was my understanding that cygwin paths are meant to be fully posix, no? If so then either there's a bug in cygwin or in the std library implementation on cygwin. Or am I misunderstanding something?

@ChrisDenton
Copy link
Member

According to the posix specification, paths can have one of two roots: / or // but the latter is optional and most treat it the same as the former.

A pathname consisting of a single shall resolve to the root directory of the process. A null pathname shall not be successfully resolved. If a pathname begins with two successive characters, the first component following the leading characters may be interpreted in an implementation-defined manner, although more than two leading characters shall be treated as a single character.

@jeremyd2019
Copy link
Contributor

jeremyd2019 commented Jun 12, 2025

Here's what a comment at the top of Cygwin's path.cc has to say:

     Pathnames are handled as follows:

     - A \ or : in a path denotes a pure windows spec.
     - Paths beginning with // (or \\) are not translated (i.e. looked
       up in the mount table) and are assumed to be UNC path names.

     The goal in the above set of rules is to allow both POSIX and Win32
     flavors of pathnames without either interfering.  The rules are
     intended to be as close to a superset of both as possible.

I think the : in a path really means letter followed by : to start a path. We saw a : in a path //./C:/ not be treated the same as the same path with a \ in it.

The docs seem to describe it better: https://cygwin.com/cygwin-ug-net/using.html#pathnames-win32

@ChrisDenton
Copy link
Member

Thanks.

So from those docs it looks like the intent is that a path is either pure windows or pure posix? Deciding which one a path is requires both looking at the root and scanning the whole path for the presence of \.

@jeremyd2019
Copy link
Contributor

jeremyd2019 commented Jun 12, 2025

That matches what the code of Cygwin's normalize_posix_path function does. If it starts with alpha : (/ or \) or contains \ it calls normalize_win32_path instead (and then turns all \ into / for posix callers)

@Berrysoft
Copy link
Contributor Author

I think this PR is ready now. @jeremyd2019 would you take a look?

@Berrysoft
Copy link
Contributor Author

OK, I think you're into stuff that matters more about what the rust apis are supposed to do rather than what Cygwin does... If the prefix parser is meant for determining "roots you can't .. your way out of" vs "an alternate namespace starts here". I bet it's the former, because otherwise it would want to deal with mount points on unix. Basically a native rust implementation of PathCchStripToRoot

Yes, it's the former. That's why I finally think the prefix of //./C:/-style POSIX path is None.

@tgross35
Copy link
Contributor

Because it involves some nuances of Windows paths, I think it would be best for Chris to do the final review for this rather than me.

r? @ChrisDenton

Copy link
Contributor

@jeremyd2019 jeremyd2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the tests seem consistent with Cygwin's path handling.

@Berrysoft
Copy link
Contributor Author

Should we get PWD to determine whether a path starting with \ is absolute? (That's awful, I think.)

https://inbox.sourceware.org/cygwin/35feb8a2-17eb-fcc8-48dc-58166f708b1a@jdrake.com/

@jeremyd2019
Copy link
Contributor

I don't think so. There's no way that's intended behavior.

Copy link
Member

@ChrisDenton ChrisDenton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I just have one small issue but otherwise I think this is good to merge.

@ChrisDenton
Copy link
Member

Thanks for working on this!

@bors r+

@bors
Copy link
Collaborator

bors commented Jun 13, 2025

📌 Commit 568a3ad has been approved by ChrisDenton

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-cygwin Target: *-pc-cygwin S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants